Connectedness Profiles in Protein Networks for the Analysis of Gene Expression Data
Abstract
Knowledge about protein function is often encoded as large, sparse undirected graphs in which vertices are proteins and edges represent their functional relationships. An elementary task in the computational use of these networks is quantifying the density of edges, referred to as connectedness, inside a prescribed protein set. For instance, many functional modules can be identified by their high connectedness. Since individual proteins can have very different numbers of interactions, a connectedness measure should be well normalized for vertex degree; that is, its distribution across random sets of vertices should not be affected when these sets are biased toward hubs. We show that such degree-robustness can be achieved via an analytical framework based on a random graph model with given expected degrees. We also introduce the concept of a connectedness profile, which characterizes the relation between adjacency in a graph and a prescribed order of its vertices. A straightforward application to gene expression data and protein networks is the identification of tissue-specific functional modules or of cellular processes perturbed in an experiment. The strength of the mapping between gene-expression score and interaction in the network is measured by the area of the connectedness profile. Deriving the distribution of this area under the random graph model enables us to define degree-robust statistics that can be computed in O(M), where M is the network size. These statistics can identify groups of microarray experiments that are pathway-coherent and, more generally, vertex attributes that relate to adjacency in a graph.
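The abstract does not give the formulas, so the following is only a minimal sketch of the two ideas under stated assumptions. It assumes the random graph with given expected degrees is the common form in which vertices u and v are joined with probability d_u * d_v / (2 * m), with m the number of edges (often attributed to Chung and Lu). It also assumes the connectedness profile at position k is the number of edges among the top k vertices of the ordering. The function names (`degrees_from_edges`, `observed_edges`, `expected_edges`, `connectedness_profile`, `profile_area`) are hypothetical and are not taken from the paper, and the paper's actual statistic may differ.

```python
def degrees_from_edges(edges):
    """Degree of every vertex in an undirected edge list (no self-loops assumed)."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return degree


def observed_edges(edges, vertex_set):
    """Number of edges with both endpoints inside vertex_set."""
    inside = set(vertex_set)
    return sum(1 for u, v in edges if u in inside and v in inside)


def expected_edges(degree, vertex_set, num_edges):
    """Expected within-set edge count, ASSUMING P(u~v) = d_u * d_v / (2 * num_edges).

    The sum of d_u * d_v over unordered pairs equals (total^2 - sum_sq) / 2,
    which gives the closed form below without enumerating pairs.
    """
    ds = [degree[v] for v in set(vertex_set)]
    total = sum(ds)
    sum_sq = sum(d * d for d in ds)
    return (total * total - sum_sq) / (4.0 * num_edges)


def connectedness_profile(edges, ranking):
    """profile[k] = number of edges among the first k + 1 vertices of ranking.

    Each edge is counted at the position of its lower-ranked endpoint, so the
    whole profile costs one pass over the edges plus one pass over the vertices.
    """
    rank = {v: i for i, v in enumerate(ranking)}
    closed_at = [0] * len(ranking)
    for u, v in edges:
        if u in rank and v in rank:
            closed_at[max(rank[u], rank[v])] += 1
    profile, running = [], 0
    for count in closed_at:
        running += count
        profile.append(running)
    return profile


def profile_area(profile):
    """Discrete area under the profile (plain sum of its values)."""
    return sum(profile)


if __name__ == "__main__":
    # Toy network: a triangle a-b-c with a pendant vertex d attached to c.
    toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
    toy_degree = degrees_from_edges(toy_edges)
    module = {"a", "b", "c"}
    print(observed_edges(toy_edges, module))
    print(expected_edges(toy_degree, module, len(toy_edges)))
    print(connectedness_profile(toy_edges, ["a", "b", "c", "d"]))
```

Comparing the observed count with the degree-based expectation is what keeps a set of hubs from looking artificially dense. Note that d_u * d_v / (2 * m) can exceed 1 for very high-degree pairs, so a real implementation would need to handle that case.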
Similar resources
Construction and Analysis of Tissue-Specific Protein-Protein Interaction Networks in Humans
We have studied the changes in the protein-protein interaction networks of 38 different tissues of the human body. A total of 123 gene expression samples from these tissues were used to construct a human protein-protein interaction network. This network is then pruned using the gene expression samples of each tissue to construct different protein-protein interaction networks corresponding to different studied ti...
Using the Protein-protein Interaction Network to Identifying the Biomarkers in Evolution of the Oocyte
Background: Oocyte maturity includes nuclear and cytoplasmic maturity, both of which are important for embryo fertilization. Oocyte development is not limited to the period of follicular growth; it starts in the embryonic period and continues throughout life. In this study, for the purpose of evaluating the effect of the FSH hormone on the expression of genes, GEO access codes for this...
Multivariate Feature Extraction for Prediction of Future Gene Expression Profile
Introduction: The features of a cell can be extracted from its gene expression profile. If the gene expression profiles of future descendant cells are predicted, the features of the future cells are also predicted. The objective of this study was to design an artificial neural network to predict gene expression profiles of descendant cells that will be generated by division/differentiation of h...
Identification of diagnostic biomarkers by bioinformatics analysis in the inflamed and non-inflamed intestinal mucosa in Crohn's disease patients
Background: Crohn's disease (CD) is a type of inflammatory bowel disease (IBD) which, although its details remain unknown, is generally related to genetic, immune system, and environmental factors. In this study, we identify transcriptional signatures in patients with CD and then explain the potential molecular mechanisms in inflamed and non-inflamed intestinal mucosa in these patients. Materials and Me...
I-15: Assessment of Transcript and Protein Profiles of Infertile Individual May Help to Select Individuals with Low Fertilization Potential Candidate of Artificial Oocyte Activation
Background: Following sperm penetration, the oocyte is activated by sperm oocyte activating factors (SOAFs) released by the sperm. Sperm-specific phospholipase C isoform ζ (PLCζ) and post acrosomal WW binding protein (PAWP) are two candidates for the SOAF. PLCζ is located back-to-back with another testis-specific gene called CAPZA3. These two genes share a common bidirectional promoter. In this study we as...